智能论文笔记

Automated discovery of interpretable gravitational-wave population models

Kaze W. K Wong , Miles Cranmer

分类：机器学习

2022-07-25

我们提出了一种自动方法，可以从数据中发现重力波（GW）事件的分析人群模型。随着检测到更多引力波（GW）事件，由于其表现性，诸如高斯混合模型之类的柔性模型（例如高斯混合模型）变得越来越重要。但是，灵活的模型带有许多缺乏身体动机的参数，从而解释了这些模型的含义。在这项工作中，我们证明了符号回归可以通过将这种柔性模型的后验预测分布提炼成可解释的分析表达式来补充灵活模型。我们恢复了常见的GW人群模型，例如Power-Law-Plus-Gaussis，并找到了一种结合准确性和简单性的新经验人群模型。这表明了一种在不断增长的GW目录中自动发现可解释的人群模型的策略，该模型可能会应用于其他天体物理现象。

translated by 谷歌翻译

The CAMELS project: public data release

Francisco Villaescusa-Navarro , Shy Genel , Daniel Anglés-Alcázar , Lucia A. Perez , Pablo Villanueva-Domingo , Digvijay Wadekar , Helen Shao , Faizan G. Mohammad , Sultan Hassan , Emily Moser

分类：人工智能 | 机器学习

2022-01-04

制定了具有机器学习模拟（骆驼）项目的宇宙学和天体物理学，通过数千名宇宙的流体动力模拟和机器学习将宇宙学与天体物理学结合起来。骆驼包含4,233个宇宙学仿真，2,049个n-body和2,184个最先进的流体动力模拟，在参数空间中采样巨大的体积。在本文中，我们介绍了骆驼公共数据发布，描述了骆驼模拟的特性和由它们产生的各种数据产品，包括光环，次麦，银河系和空隙目录，功率谱，Bispectra，Lyman - $ \ Alpha $光谱，概率分布函数，光环径向轮廓和X射线光子列表。我们还释放了超过骆驼 - 山姆的数十亿个星系的目录：与Santa Cruz半分析模型相结合的大量N身体模拟。我们释放包含350多个Terabytes的所有数据，并包含143,922个快照，数百万光环，星系和摘要统计数据。我们提供有关如何访问，下载，读取和处理数据AT \ URL {https://camels.readthedocs.io}的进一步技术详细信息。

translated by 谷歌翻译

Projected State-action Balancing Weights for Offline Reinforcement Learning

Jiayi Wang , Zhengling Qi , Raymond K. W. Wong

分类：机器学习

2021-09-10

离线政策评估（OPE）被认为是强化学习（RL）的基本且具有挑战性的问题。本文重点介绍了基于从无限 - 马尔可夫决策过程的框架下从可能不同策略生成的预收集的数据的目标策略的价值估计。由RL最近开发的边际重要性采样方法和因果推理中的协变量平衡思想的动机，我们提出了一个新颖的估计器，具有大约投影的国家行动平衡权重，以进行策略价值估计。我们获得了这些权重的收敛速率，并表明拟议的值估计量在技术条件下是半参数有效的。就渐近学而言，我们的结果比例均以每个轨迹的轨迹数量和决策点的数量进行扩展。因此，当决策点数量分歧时，仍然可以使用有限的受试者实现一致性。此外，我们开发了一个必要且充分的条件，以建立贝尔曼操作员在政策环境中的适当性，这表征了OPE的困难，并且可能具有独立的利益。数值实验证明了我们提出的估计量的有希望的性能。

translated by 谷歌翻译

Hungry Hungry Hippos: Towards Language Modeling with State Space Models

Tri Dao , Daniel Y. Fu , Khaled K. Saab , Armin W. Thomas , Atri Rudra , Christopher Ré

分类：机器学习 | 自然语言处理

2022-12-28

State space models (SSMs) have demonstrated state-of-the-art sequence modeling performance in some modalities, but underperform attention in language modeling. Moreover, despite scaling nearly linearly in sequence length instead of quadratically, SSMs are still slower than Transformers due to poor hardware utilization. In this paper, we make progress on understanding the expressivity gap between SSMs and attention in language modeling, and on reducing the hardware barrier between SSMs and attention. First, we use synthetic language modeling tasks to understand the gap between SSMs and attention. We find that existing SSMs struggle with two capabilities: recalling earlier tokens in the sequence and comparing tokens across the sequence. To understand the impact on language modeling, we propose a new SSM layer, H3, that is explicitly designed for these abilities. H3 matches attention on the synthetic languages and comes within 0.4 PPL of Transformers on OpenWebText. Furthermore, a hybrid 125M-parameter H3-attention model that retains two attention layers surprisingly outperforms Transformers on OpenWebText by 1.0 PPL. Next, to improve the efficiency of training SSMs on modern hardware, we propose FlashConv. FlashConv uses a fused block FFT algorithm to improve efficiency on sequences up to 8K, and introduces a novel state passing algorithm that exploits the recurrent properties of SSMs to scale to longer sequences. FlashConv yields 2$\times$ speedup on the long-range arena benchmark and allows hybrid language models to generate text 1.6$\times$ faster than Transformers. Using FlashConv, we scale hybrid H3-attention language models up to 1.3B parameters on the Pile and find promising initial results, achieving lower perplexity than Transformers and outperforming Transformers in zero- and few-shot learning on a majority of tasks in the SuperGLUE benchmark.

translated by 谷歌翻译

Semi-Siamese Network for Robust Change Detection Across Different Domains with Applications to 3D Printing

Yushuo Niu , Ethan Chadwick , Anson W. K. Ma , Qian Yang

分类：计算机视觉

2022-12-16

Automatic defect detection for 3D printing processes, which shares many characteristics with change detection problems, is a vital step for quality control of 3D printed products. However, there are some critical challenges in the current state of practice. First, existing methods for computer vision-based process monitoring typically work well only under specific camera viewpoints and lighting situations, requiring expensive pre-processing, alignment, and camera setups. Second, many defect detection techniques are specific to pre-defined defect patterns and/or print schematics. In this work, we approach the automatic defect detection problem differently using a novel Semi-Siamese deep learning model that directly compares a reference schematic of the desired print and a camera image of the achieved print. The model then solves an image segmentation problem, identifying the locations of defects with respect to the reference frame. Unlike most change detection problems, our model is specially developed to handle images coming from different domains and is robust against perturbations in the imaging setup such as camera angle and illumination. Defect localization predictions were made in 2.75 seconds per layer using a standard MacBookPro, which is comparable to the typical tens of seconds or less for printing a single layer on an inkjet-based 3D printer, while achieving an F1-score of more than 0.9.

translated by 谷歌翻译

YoloCurvSeg: You Only Label One Noisy Skeleton for Vessel-style Curvilinear Structure Segmentation

Li Lin , Linkai Peng , Huaqing He , Pujin Cheng , Jiewei Wu , Kenneth K. Y. Wong , Xiaoying Tang

分类：计算机视觉

2022-12-11

Weakly-supervised learning (WSL) has been proposed to alleviate the conflict between data annotation cost and model performance through employing sparsely-grained (i.e., point-, box-, scribble-wise) supervision and has shown promising performance, particularly in the image segmentation field. However, it is still a very challenging problem due to the limited supervision, especially when only a small number of labeled samples are available. Additionally, almost all existing WSL segmentation methods are designed for star-convex structures which are very different from curvilinear structures such as vessels and nerves. In this paper, we propose a novel sparsely annotated segmentation framework for curvilinear structures, named YoloCurvSeg, based on image synthesis. A background generator delivers image backgrounds that closely match real distributions through inpainting dilated skeletons. The extracted backgrounds are then combined with randomly emulated curves generated by a Space Colonization Algorithm-based foreground generator and through a multilayer patch-wise contrastive learning synthesizer. In this way, a synthetic dataset with both images and curve segmentation labels is obtained, at the cost of only one or a few noisy skeleton annotations. Finally, a segmenter is trained with the generated dataset and possibly an unlabeled dataset. The proposed YoloCurvSeg is evaluated on four publicly available datasets (OCTA500, CORN, DRIVE and CHASEDB1) and the results show that YoloCurvSeg outperforms state-of-the-art WSL segmentation methods by large margins. With only one noisy skeleton annotation (respectively 0.14%, 0.02%, 1.4%, and 0.65% of the full annotation), YoloCurvSeg achieves more than 97% of the fully-supervised performance on each dataset. Code and datasets will be released at https://github.com/llmir/YoloCurvSeg.

translated by 谷歌翻译

Normalizing Flows for Hierarchical Bayesian Analysis: A Gravitational Wave Population Study

David Ruhe , Kaze Wong , Miles Cranmer , Patrick Forré

分类：机器学习

2022-11-15

We propose parameterizing the population distribution of the gravitational wave population modeling framework (Hierarchical Bayesian Analysis) with a normalizing flow. We first demonstrate the merit of this method on illustrative experiments and then analyze four parameters of the latest LIGO/Virgo data release: primary mass, secondary mass, redshift, and effective spin. Our results show that despite the small and notoriously noisy dataset, the posterior predictive distributions (assuming a prior over the parameters of the flow) of the observed gravitational wave population recover structure that agrees with robust previous phenomenological modeling results while being less susceptible to biases introduced by less-flexible distribution models. Therefore, the method forms a promising flexible, reliable replacement for population inference distributions, even when data is highly noisy.

translated by 谷歌翻译

A direct time-of-flight image sensor with in-pixel surface detection and dynamic vision

Istvan Gyongy , Ahmet T. Erdogan , Neale A. W. Dutton , Germán Mora Martín , Alistair Gorman , Hanning Mai , Francesco Mattioli Della Rocca , Robert K. Henderson

分类：计算机视觉

2022-09-23

3D Flash LiDAR是传统扫描激光雷达系统的替代方法，有望在紧凑的外形尺寸中进行精确的深度成像，并且没有运动部件，例如自动驾驶汽车，机器人技术和增强现实（AR）等应用。通常在图像传感器格式中使用单光子，直接飞行时间（DTOF）接收器实施，设备的操作可能会受到需要在室外场景中处理和压缩的大量光子事件的阻碍以及对较大数组的可扩展性。我们在这里提出了一个64x32像素（256x128 spad）DTOF成像器，该成像器通过将像素与嵌入式直方图使用像素一起克服这些局限性，该直方直方图锁定并跟踪返回信号。这大大降低了输出数据帧的大小，可在10 kfps范围内或100 kfps的最大帧速率进行直接深度读数。该传感器可选择性地读数检测表面或传感运动的像素，从而减少功耗和片外处理要求。我们演示了传感器在中端激光雷达中的应用。

translated by 谷歌翻译

Data efficient reinforcement learning and adaptive optimal perimeter control of network traffic dynamics

C. Chen , Y. P. Huang , W. H. K. Lam , T. L. Pan , S. C. Hsu , A. Sumalee , R. X. Zhong

分类：机器学习

2022-09-13

现有的数据驱动和反馈流量控制策略不考虑实时数据测量的异质性。此外，对于缺乏数据效率，传统的加固学习方法（RL）方法通常会缓慢收敛。此外，常规的最佳外围控制方案需要对系统动力学的精确了解，因此对内源性不确定性会很脆弱。为了应对这些挑战，这项工作提出了一种基于不可或缺的增强学习（IRL）的方法来学习宏观交通动态，以进行自适应最佳周边控制。这项工作为运输文献做出了以下主要贡献：（a）开发连续的时间控制，并具有离散增益更新以适应离散时间传感器数据。（b）为了降低采样复杂性并更有效地使用可用数据，将体验重播（ER）技术引入IRL算法。（c）所提出的方法以“无模型”方式放松模型校准的要求，该方式可以稳健地进行建模不确定性，并通过数据驱动的RL算法增强实时性能。（d）通过Lyapunov理论证明了基于IRL的算法和受控交通动力学的稳定性的收敛性。最佳控制定律被参数化，然后通过神经网络（NN）近似，从而缓解计算复杂性。在不需要模型线性化的同时，考虑了状态和输入约束。提出了数值示例和仿真实验，以验证所提出方法的有效性和效率。

translated by 谷歌翻译

Graph Neural Networks for Low-Energy Event Classification & Reconstruction in IceCube

R. Abbasi , M. Ackermann , J. Adams , N. Aggarwal , J. A. Aguilar , M. Ahlers , M. Ahrens , J. M. Alameddine , A. A. Alves Jr. , N. M. Amin

分类：机器学习

2022-09-07

ICECUBE是一种用于检测1 GEV和1 PEV之间大气和天体中微子的光学传感器的立方公斤阵列，该阵列已部署1.45 km至2.45 km的南极的冰盖表面以下1.45 km至2.45 km。来自ICE探测器的事件的分类和重建在ICeCube数据分析中起着核心作用。重建和分类事件是一个挑战，这是由于探测器的几何形状，不均匀的散射和冰中光的吸收，并且低于100 GEV的光，每个事件产生的信号光子数量相对较少。为了应对这一挑战，可以将ICECUBE事件表示为点云图形，并将图形神经网络（GNN）作为分类和重建方法。 GNN能够将中微子事件与宇宙射线背景区分开，对不同的中微子事件类型进行分类，并重建沉积的能量，方向和相互作用顶点。基于仿真，我们提供了1-100 GEV能量范围的比较与当前ICECUBE分析中使用的当前最新最大似然技术，包括已知系统不确定性的影响。对于中微子事件分类，与当前的IceCube方法相比，GNN以固定的假阳性速率（FPR）提高了信号效率的18％。另外，GNN在固定信号效率下将FPR的降低超过8（低于半百分比）。对于能源，方向和相互作用顶点的重建，与当前最大似然技术相比，分辨率平均提高了13％-20％。当在GPU上运行时，GNN能够以几乎是2.7 kHz的中位数ICECUBE触发速率的速率处理ICECUBE事件，这打开了在在线搜索瞬态事件中使用低能量中微子的可能性。

translated by 谷歌翻译